Extreme k-Center Clustering


Abstract

Metric clustering is a fundamental primitive in machine learning with several applications for mining massive datasets. An important example of metric clustering is the k-center problem. While this problem has been extensively studied in distributed settings, all previous algorithms use Ω(k) space per machine and Ω(nk) total work. In this paper, we develop the first highly scalable approximation algorithm for k-center clustering, with Õ(n^ε) space per machine and Õ(n^(1+ε)) total work, for an arbitrarily small constant ε. It produces an O(log log log n)-approximate solution with k(1+o(1)) centers in O(log log n) rounds of computation.
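As a point of reference for the Ω(nk) work bound mentioned in the abstract, the following is a minimal sketch of the classical sequential greedy heuristic (farthest-first traversal), which is known to give a 2-approximation for metric k-center and evaluates about n·k distances. It is not the distributed algorithm of this paper; the function name `greedy_k_center` and the choice of Euclidean distance are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def greedy_k_center(points: Sequence[Point], k: int) -> Tuple[List[int], float]:
    """Farthest-first traversal (hypothetical helper, Euclidean metric assumed).

    Returns the indices of the chosen centers and the covering radius,
    i.e. the largest distance from any point to its nearest center.
    """
    if not points or k <= 0:
        raise ValueError("need at least one point and k >= 1")
    k = min(k, len(points))

    centers = [0]  # the first center is arbitrary
    # dist[i] = distance from point i to its nearest chosen center so far
    dist = [math.dist(points[0], p) for p in points]

    while len(centers) < k:
        # Pick the point that is currently farthest from all chosen centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        # One pass over all n points per new center: n * k distances in total.
        for i, p in enumerate(points):
            d = math.dist(points[nxt], p)
            if d < dist[i]:
                dist[i] = d

    return centers, max(dist)
```

For example, on the four collinear points `(0, 0), (1, 0), (10, 0), (11, 0)` with `k = 2`, the sketch picks the first and last points as centers and reports a covering radius of 1.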


Related resources

k-Center Clustering Under Perturbation Resilience

The k-center problem is a canonical and long-studied facility location and clustering problem with many applications in both its symmetric and asymmetric forms. Both versions of the problem have tight approximation factors on worst case instances: a 2-approximation for symmetric k-center and an O(log*(k))-approximation for the asymmetric version. Therefore, to improve on these ratios, one must go...


Balanced k-Center Clustering When k Is A Constant

The problem of constrained k-center clustering has attracted significant attention in the past decades. In this paper, we study balanced k-center clustering, where the size of each cluster is constrained by given lower and upper bounds. The problem is motivated by applications in processing and analyzing large-scale data in high dimensions. We provide a simple nearly linear time 4-approximat...


An Efficient Implementation of the Robust k-Center Clustering Problem

The standard k-center clustering problem is very sensitive to outliers. Charikar et al. proposed an alternative algorithm to cluster p points out of n total, thereby avoiding the distortion caused by outliers. The algorithm has an approximation bound of three times the true solution, but is very slow if implemented naively. We propose a modified implementation of the algorithm that runs signifi...
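To make the outlier-tolerant objective in this snippet concrete, here is a small sketch that evaluates it for a given set of centers: the smallest radius that covers at least p of the n points. It only computes the objective and is not the algorithm of Charikar et al. or the modified implementation the snippet refers to; the name `robust_radius` and the Euclidean metric are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def robust_radius(points: Sequence[Point], centers: Sequence[Point], p: int) -> float:
    """Smallest r such that at least p points lie within r of some center.

    Hypothetical helper: the n - p farthest points are treated as outliers.
    """
    if not centers or not 1 <= p <= len(points):
        raise ValueError("need at least one center and 1 <= p <= len(points)")
    # Distance from every point to its nearest center, sorted ascending.
    nearest = sorted(min(math.dist(x, c) for c in centers) for x in points)
    # The p-th smallest distance is the radius needed to cover p points.
    return nearest[p - 1]
```

With points `(0, 0), (1, 0), (100, 0)` and a single center at the origin, covering all three points needs radius 100, while covering only p = 2 of them needs radius 1, which shows how a single outlier can dominate the standard objective.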


Symmetric and Asymmetric $k$-center Clustering under Stability

The k-center problem is a canonical and long-studied facility location and clustering problem with many applications in both its symmetric and asymmetric forms. Both versions of the problem have tight approximation factors on worst case instances: a 2-approximation for symmetric k-center and an O(log∗(k))-approximation for the asymmetric version. Therefore to improve on these ratios, one must g...


Cluster center initialization algorithm for K-modes clustering

Partitional clustering of categorical data is normally performed using the K-modes clustering algorithm, which works well for large datasets. Even though the design and implementation of the K-modes algorithm are simple and efficient, it has the pitfall of randomly choosing the initial cluster centers on every new execution, which may lead to non-repeatable clustering results. This paper addr...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i5.16513